Independent Automatic Segmentation of Speech by Pronunciation Modeling

Authors

  • Nicole Beringer
  • Florian Schiel
Abstract

In this paper we present an iterative automatic segmentation system that does not require any domain-dependent training data. The input to the system is the canonical pronunciation and the speech signal of the utterance to be segmented, together with a set of phonological pronunciation rules. The output is a string of phonetic labels (SAM-PA [1]) and the corresponding segment boundaries in the speech signal. The system consists of three main parts. In the first stage, a set of general phonological rules is applied to the canonical pronunciation of an utterance, yielding a graph that contains the canonical form and its presumed variants. In the second, HMM-based stage, the speech signal of that utterance is time-aligned to this graph using a Viterbi search. The outcome of this stage is the time-aligned transcription of the input utterance. In the third stage, a new set of statistically weighted rules is derived, using this "raw" application of the phonological rules as the baseline. The procedure is repeated until the segmentation no longer changes.
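The three-stage loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The rule format, the names `variants` and `iterate`, and the smoothed usage ratio used as a rule weight are all assumptions made for illustration. The HMM-based Viterbi alignment is replaced by a caller-supplied `acoustic_score` function, so only variant selection is shown and no segment boundaries are produced.

```python
import math

def variants(canonical, rules):
    """Stage 1 (sketch): the canonical form plus every form reachable by
    applying rewrite rules. Maps each form to the rule indices used.
    Assumes the rules cannot grow the form without bound."""
    start = tuple(canonical)
    seen = {start: ()}
    frontier = [start]
    while frontier:
        form = frontier.pop()
        for idx, (pattern, replacement) in enumerate(rules):
            n = len(pattern)
            for i in range(len(form) - n + 1):
                if form[i:i + n] == pattern:
                    new = form[:i] + replacement + form[i + n:]
                    if new not in seen:
                        seen[new] = seen[form] + (idx,)
                        frontier.append(new)
    return seen

def iterate(utterances, rules, acoustic_score, max_iter=20):
    """Stages 2 and 3 (sketch): pick the best variant per utterance,
    re-weight the rules from how often they were used, and repeat until
    the chosen variants stop changing."""
    # Weights start at 1.0 (log weight 0), i.e. the "raw" unweighted rules.
    weights = [1.0] * len(rules)
    previous = None
    chosen = []
    for _ in range(max_iter):
        chosen = []
        counts = [0] * len(rules)
        for canonical, signal in utterances:
            cands = variants(canonical, rules)
            # acoustic_score stands in for the HMM/Viterbi alignment score.
            best = max(
                cands,
                key=lambda f: acoustic_score(f, signal)
                + sum(math.log(weights[r]) for r in cands[f]),
            )
            chosen.append(best)
            for r in cands[best]:
                counts[r] += 1
        if chosen == previous:
            break
        previous = chosen
        # Assumed weighting: add-one smoothed usage ratio per rule.
        weights = [(c + 1) / (len(utterances) + 2) for c in counts]
    return chosen, weights

# Illustrative rules over SAM-PA-like symbols (hypothetical, not from the paper).
RULES = [
    (("@", "n"), ("n",)),  # schwa deletion before /n/
    (("b", "n"), ("m",)),  # /bn/ merged to /m/
]
```

A call might look like `iterate([(["h", "a:", "b", "@", "n"], signal)], RULES, score)`, where `signal` and `score` are whatever representation and scoring function the caller supplies. In a real system the score would come from the Viterbi alignment against the full variant graph rather than from scoring each variant separately.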

Similar resources

Pronunciation modeling applied to automatic segmentation of spontaneous speech

In this paper two different models of pronunciation are presented: the first model is based on a rule set compiled by an expert, while the second is statistically based, exploiting a survey about pronunciation variants occurring in training data. Both models generate pronunciation variants from the canonic forms of words. The two models are evaluated by applying them to the task of automatic segme...

Independent automatic segmentation by self-learning categorial pronunciation rules

The goal of this paper is to present a new method to automatically generate pronunciation rules for automatic segmentation of speech: the German MAUSER system. MAUSER is an algorithm which generates pronunciation rules independently of any domain dependent training data either by clustering and statistically weighting self-learned rules according to a small set of phonological rules clustered by...

Pronunciation Modeling Applied to Automatic Segmentation of Spontaneous

In this paper two different models of pronunciation are presented: the first model is based on a rule set compiled by an expert, while the second is statistically based, exploiting a survey about pronunciation variants occurring in training data. Both models generate pronunciation variants from the canonic forms of words. The two models are evaluated by applying them to the task of automatic seg...

A study of implicit and explicit modeling of coarticulation and pronunciation variation

In this paper, we focus on the modeling of coarticulation and pronunciation variation in Automatic Speech Recognition systems (ASR). Most ASR systems explicitly describe these production phenomena through context-dependent phoneme models and multiple pronunciation lexicons. Here, we explore the potential benefit of using feature spaces covering longer time segments in terms of implicit modeling...

Speech is like a box of

Pronunciation variability is present in both native and foreign words. Since pronunciation variability constitutes a problem for automatic speech recognition (ASR) systems, modeling pronunciation variation for ASR has been the topic of various studies. In most studies, modeling pronunciation variation was attempted within the standard framework used in mainstream ASR systems. Given that some as...

Publication date: 1999